This analysis addresses three major knowledge gaps for ectomycorrhizal fungi (EMF) in Canada: the Eltonian, Hutchinsonian, and Wallacean shortfalls.
The analysis combines tree species distribution data, mycorrhizal databases, and EMF sequence data to quantify these knowledge gaps across Canadian ecosystems. We examine EMF diversity at multiple taxonomic levels including sequence-based taxa (Other_ID and UNITE_ID), genera, and species.
The analysis follows a structured pipeline implemented across multiple R scripts:

- 01_setup.R: Environment configuration and library loading
- 02_download_data.R: Acquisition of external datasets
- 03_process_spatial.R: Geographic data preparation
- 04_process_fungal.R: EMF and mycorrhizal data integration
- 05_calculate_metrics.R: Richness and coverage analysis
- 06_create_maps.R: Map creation and spatial visualization

EMF diversity is analyzed at four taxonomic levels: Other_ID, UNITE_ID, genus, and species.
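The script sequence listed above could be driven from a single entry point. The following is a minimal sketch, not the project's actual runner: the script names come from the list, while the `scripts` directory and the `run_pipeline()` helper are hypothetical.

```r
# Script names are taken from the pipeline list; the directory layout and
# the run_pipeline() helper are assumptions made for illustration.
pipeline_scripts <- c(
  "01_setup.R",
  "02_download_data.R",
  "03_process_spatial.R",
  "04_process_fungal.R",
  "05_calculate_metrics.R",
  "06_create_maps.R"
)

run_pipeline <- function(script_dir = "scripts", scripts = pipeline_scripts) {
  paths <- file.path(script_dir, scripts)

  # Fail early if any step is missing, before running anything.
  missing_paths <- paths[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("Missing pipeline scripts: ", paste(missing_paths, collapse = ", "))
  }

  # Run the steps in numeric order. source() evaluates each script in the
  # global environment by default, so later steps can see earlier objects
  # (this assumes the scripts share state that way).
  for (path in paths) {
    message("Running ", path)
    source(path)
  }
  invisible(paths)
}
```

Calling `run_pipeline()` from the project root would then execute the six steps in order, stopping at the first error.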
| Metric | Value |
|---|---|
| Total EMF sequence records | 6815 |
| Total unique sampling locations (all data) | 815 |
| Unique locations with EMF data | 367 |
| Unique Other_ID values | 1034 |
| Unique UNITE_ID values | 255 |
| Unique EMF genera | 139 |
| Unique EMF species | 807 |
| Canadian EMF host tree species | 99 |
| Host species with sequence data | 11 |
| Percentage of host species with data | 11.11% |
| Unique host genera with data | 7 |
| Percentage of host genera with data | 21.21% |
| Total ecoregions | 218 |
| Ecoregions with potential EMF habitat | 196 |
| Ecoregions sampled for EMF | 51 |
| Percentage of habitat ecoregions unsampled | 71.02% |
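As a quick check on how the host-species percentage in the table above relates to its counts, the arithmetic can be sketched as follows. The counts are copied from the table; `pct()` is a hypothetical helper, not a function from the analysis scripts.

```r
# Counts copied from the summary table.
host_species_total     <- 99  # Canadian EMF host tree species
host_species_with_data <- 11  # host species with sequence data

# pct() is a hypothetical helper: share of `part` in `whole`, as a percentage.
pct <- function(part, whole, digits = 2) {
  round(100 * part / whole, digits)
}

pct_hosts_with_data <- pct(host_species_with_data, host_species_total)
pct_hosts_with_data
#> [1] 11.11
```

The genus-level percentage is not recomputed here because the table does not report the total number of host genera used as its denominator.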
| Knowledge Shortfall | Description | Key Finding |
|---|---|---|
| Eltonian Shortfall | Gap in knowledge about species interactions | 11.11% of Canadian EMF host tree species have associated fungal sequence data; 807 unique EMF species identified |
| Hutchinsonian Shortfall | Gap in knowledge about species’ abiotic niches | 71.02% of habitat ecoregions remain unsampled |
| Wallacean Shortfall | Gap in knowledge about species distributions | 815 total unique sampling locations; 367 locations with EMF sequence data; 1034 unique Other_ID and 255 unique UNITE_ID sequence-based taxa |
| Taxonomic Level | Unique Taxa | Mean Locations/Taxon | Max Locations/Taxon | Host Species | Mean Taxa/Host |
|---|---|---|---|---|---|
| Other_ID | 1034 | 1.74 | 11 | 16 | 64.56 |
| UNITE_ID | 255 | 1.51 | 11 | 15 | 17.73 |
| Genus | 139 | 16.93 | 196 | 16 | 20.31 |
| Species | 807 | 5.85 | 94 | 16 | 53.88 |
| Taxonomic Level | Unique Taxa | Mean Locations | Max Locations | Min Locations |
|---|---|---|---|---|
| Other_ID | 1034 | 1.74 | 11 | 1 |
| UNITE_ID | 255 | 1.51 | 11 | 1 |
| Genus | 139 | 16.93 | 196 | 1 |
| Species | 807 | 5.85 | 94 | 1 |
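The per-taxon location statistics in the table above could be derived along the following lines. This is a sketch on toy data in base R; the column names `taxon` and `location_id` are assumptions and may not match the project's data.

```r
# Toy sequence records; column names are assumptions for illustration.
records <- data.frame(
  taxon       = c("A", "A", "A", "B", "B", "C"),
  location_id = c("s1", "s1", "s2", "s1", "s3", "s2"),
  stringsAsFactors = FALSE
)

# Repeat records of a taxon at the same location count only once.
unique_pairs <- unique(records[, c("taxon", "location_id")])

# Number of distinct locations per taxon, as a named integer vector.
locations_per_taxon <- vapply(
  split(unique_pairs$location_id, unique_pairs$taxon),
  length,
  integer(1)
)

location_summary <- data.frame(
  unique_taxa    = length(locations_per_taxon),
  mean_locations = round(mean(locations_per_taxon), 2),
  max_locations  = max(locations_per_taxon),
  min_locations  = min(locations_per_taxon)
)
location_summary
```

For the toy data, taxa A and B each occur at two locations and C at one, giving a mean of 1.67 locations per taxon.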
| Taxonomic Level | Host Species | EMF Taxa | Mean Taxa/Host | Max Taxa/Host | Mean Hosts/Taxon |
|---|---|---|---|---|---|
| Other_ID | 16 | 752 | 64.56 | 266 | 1.37 |
| UNITE_ID | 15 | 206 | 17.73 | 97 | 1.29 |
| Genus | 16 | 95 | 20.31 | 53 | 3.42 |
| Species | 16 | 477 | 53.88 | 184 | 1.81 |
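One way to obtain host-association statistics like those above is to build a presence/absence matrix of hosts by taxa and summarize its rows and columns. The sketch below uses toy data and assumed column names (`host`, `taxon`); it is not taken from the analysis scripts.

```r
# Toy host-fungus records; names are placeholders.
assoc <- data.frame(
  host  = c("Host_1", "Host_1", "Host_1", "Host_2", "Host_2", "Host_3"),
  taxon = c("Tax_a", "Tax_b", "Tax_a", "Tax_a", "Tax_c", "Tax_a"),
  stringsAsFactors = FALSE
)

# Host-by-taxon presence/absence matrix (TRUE if at least one record).
host_taxon <- unclass(table(assoc$host, assoc$taxon)) > 0

taxa_per_host   <- rowSums(host_taxon)  # distinct taxa recorded on each host
hosts_per_taxon <- colSums(host_taxon)  # distinct hosts recorded for each taxon

association_summary <- data.frame(
  host_species     = nrow(host_taxon),
  emf_taxa         = ncol(host_taxon),
  mean_taxa_host   = round(mean(taxa_per_host), 2),
  max_taxa_host    = max(taxa_per_host),
  mean_hosts_taxon = round(mean(hosts_per_taxon), 2)
)
association_summary
```

Working from a presence/absence matrix keeps repeated records of the same host-taxon pair from inflating either count.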
| Spatial Metric | Value |
|---|---|
| Grid cells with species data | 1600 |
| Cells with zero EMF coverage | 330 |
| Cells with EMF data | 1270 |
| Mean EMF coverage proportion | 0.229 |
| Maximum EMF coverage | 1 |
| Range of coverage | 0 - 1 |
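The grid-cell coverage metric summarized above can be illustrated with a small example. The sketch assumes tree occurrences with `lon`/`lat` columns, a 1° × 1° grid aligned to whole degrees, and a vector of species known to have EMF data; all names and data here are hypothetical, and the project may build its grid differently.

```r
# Toy tree occurrences; columns and values are placeholders.
occ <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp1", "sp3", "sp3"),
  lon     = c(-120.2, -120.7, -120.5, -75.3, -75.9, -100.1),
  lat     = c(50.1, 50.9, 50.4, 45.2, 45.8, 60.5),
  stringsAsFactors = FALSE
)
# Species with EMF sequence data anywhere in the study area (assumed input).
species_with_emf <- c("sp1", "sp2")

# Label each 1-degree cell by its south-west corner.
occ$cell <- paste(floor(occ$lon), floor(occ$lat), sep = "_")

# One row per species per cell, flagged by EMF data availability.
cell_species <- unique(occ[, c("cell", "species")])
cell_species$has_emf <- cell_species$species %in% species_with_emf

n_species  <- tapply(cell_species$has_emf, cell_species$cell, length)
n_with_emf <- tapply(cell_species$has_emf, cell_species$cell, sum)

coverage <- data.frame(
  cell       = names(n_species),
  n_species  = as.vector(n_species),
  n_with_emf = as.vector(n_with_emf),
  stringsAsFactors = FALSE
)
coverage$proportion <- coverage$n_with_emf / coverage$n_species

cells_zero_coverage <- sum(coverage$proportion == 0)
mean_coverage       <- mean(coverage$proportion)
```

In the toy data, one of the three cells has zero coverage, which mirrors how the table separates cells with and without EMF data.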
Distribution of EMF sampling locations across Canadian ecoregions. Points are colored by data source and aggregated within a 1 km radius.
Number of tree species within each 1° × 1° grid cell that have associated EMF sequence data somewhere in Canada.
Proportion of tree species within each 1° × 1° grid cell that have associated EMF sequence data somewhere in Canada.
Bivariate map showing, for each grid cell, both the number (richness) of EMF host tree species and the proportion of those host species with EMF records somewhere in Canada.
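A bivariate map of this kind needs each grid cell assigned to a combined class of richness and proportion. The sketch below shows one common approach, a 3 × 3 scheme built with `cut()`. The class breaks and the toy values are assumptions; the source does not state which scheme or breaks were used.

```r
# Toy per-cell values; both the data and the breaks are placeholders.
cells <- data.frame(
  richness   = c(2, 8, 15, 5, 12),
  proportion = c(0.00, 0.50, 1.00, 0.20, 0.80)
)

# Three richness classes with hypothetical fixed breaks: (0,5], (5,10], (10,Inf).
richness_class <- cut(cells$richness, breaks = c(0, 5, 10, Inf), labels = 1:3)

# Three equal-width proportion classes; include.lowest keeps 0 in class 1.
proportion_class <- cut(
  cells$proportion,
  breaks = c(0, 1 / 3, 2 / 3, 1),
  labels = 1:3,
  include.lowest = TRUE
)

# Combined label such as "3-1": high richness, low proportion with EMF data.
cells$bi_class <- paste(richness_class, proportion_class, sep = "-")
cells
```

Each of the nine possible labels can then be mapped to one color of a bivariate palette.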
| Species | Unique Locations | Total Records |
|---|---|---|
| Lachnum_virgineum | 94 | 94 |
| Cenococcum_geophilum | 60 | 60 |
| Cortinarius_decipiens | 56 | 106 |
| Tylospora_asterophora | 54 | 89 |
| Cortinarius_croceus | 52 | 56 |
| Geopyxis_carbonaria | 50 | 51 |
| Tomentella_stuposa | 48 | 52 |
| Tomentella | 42 | 154 |
| Cortinarius_casimiri | 36 | 70 |
| Mycena_aetites | 35 | 35 |
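Rankings like the one above distinguish unique locations from total records, which is why a taxon can have more records than locations. A minimal sketch of that ranking on toy data follows; the column names are assumptions, and the same logic would apply to the genus, UNITE_ID, and Other_ID rankings.

```r
# Toy records; names are placeholders.
records <- data.frame(
  species     = c("Sp_a", "Sp_a", "Sp_a", "Sp_b", "Sp_b", "Sp_c"),
  location_id = c("s1", "s1", "s2", "s1", "s3", "s4"),
  stringsAsFactors = FALSE
)

total_records    <- table(records$species)          # every record counts
unique_locations <- table(unique(records)$species)  # one count per location

sp <- names(total_records)
ranking <- data.frame(
  species          = sp,
  unique_locations = as.vector(unique_locations[sp]),
  total_records    = as.vector(total_records[sp]),
  stringsAsFactors = FALSE
)

# Sort by unique locations, breaking ties by total records, and keep the top 10.
ranking <- ranking[order(-ranking$unique_locations, -ranking$total_records), ]
top_taxa <- head(ranking, 10)
top_taxa
```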
| Genus | Unique Locations | Total Records |
|---|---|---|
| Cortinarius | 196 | 1708 |
| Tomentella | 105 | 359 |
| Lachnum | 96 | 115 |
| Tricholoma | 94 | 251 |
| Inocybe | 88 | 420 |
| Russula | 81 | 361 |
| Mycena | 80 | 187 |
| Hebeloma | 63 | 164 |
| Cenococcum | 60 | 60 |
| Peziza | 58 | 72 |
| UNITE_ID | Unique Locations | Total Records |
|---|---|---|
| SH1132005.09FU | 11 | 26 |
| SH1648320.08FU | 10 | 14 |
| SH1571570.08FU | 9 | 21 |
| SH0924970.09FU | 8 | 8 |
| SH1156627.09FU | 8 | 13 |
| SH1563787.08FU | 7 | 28 |
| SH1155880.09FU | 6 | 6 |
| SH1067874.09FU | 5 | 7 |
| SH1138802.09FU | 5 | 8 |
| SH1295836.09FU | 5 | 5 |
| Other_ID | Unique Locations | Total Records |
|---|---|---|
| UDB0780173 | 11 | 19 |
| UDB027969 | 10 | 16 |
| UDB018564 | 9 | 17 |
| UDB034984 | 9 | 54 |
| UDB0754284 | 9 | 16 |
| UDB001746 | 8 | 16 |
| UDB004960 | 8 | 9 |
| UDB016650 | 8 | 8 |
| UDB023537 | 8 | 16 |
| UDB026053 | 8 | 17 |
| Host Species | Unique EMF Species | Total Records |
|---|---|---|
| Pseudotsuga_menziesii | 184 | 381 |
| Pinus_albicaulis | 108 | 325 |
| Picea_engelmannii | 104 | 149 |
| Populus_tremuloides | 84 | 91 |
| Tsuga_heterophylla | 59 | 786 |
| Pinus_contorta | 55 | 258 |
| Populus_sp | 49 | 190 |
| Pinus_banksiana | 47 | 123 |
| Salix_arctica | 46 | 135 |
| Dryas_integrifolia | 37 | 105 |
| EMF Species | Unique Hosts | Total Records |
|---|---|---|
| Tomentella | 13 | 119 |
| Meliniomyces | 10 | 101 |
| Cortinarius_decipiens | 9 | 62 |
| Sebacina | 9 | 33 |
| Cortinarius | 8 | 105 |
| Sebacina_dimitica | 8 | 29 |
| Thelephora_terrestris | 8 | 24 |
| Tylospora_asterophora | 8 | 40 |
| Piloderma_sphaerosporum | 7 | 55 |
| Tomentella_cinereoumbrina | 7 | 12 |
This comprehensive analysis reveals key knowledge shortfalls in EMF research across Canada:
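
- Eltonian Shortfall (Species Interactions): only 11.11% of Canadian EMF host tree species (11 of 99) have associated fungal sequence data.
- Hutchinsonian Shortfall (Environmental Niches): 71.02% of ecoregions with potential EMF habitat remain unsampled.
- Wallacean Shortfall (Species Distributions): EMF sequence data come from 367 of 815 unique sampling locations, and sequence-based taxa are known from fewer than two locations on average (1.74 for Other_ID, 1.51 for UNITE_ID).
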
All analysis outputs are available in the project’s outdata/ directory:
| File Name | Description | Available |
|---|---|---|
| comprehensive_emf_summary.csv | Summary statistics across all taxonomic levels | TRUE |
| locations_per_other_id.csv | Sampling locations per Other_ID sequence-based taxon | TRUE |
| locations_per_unite_id.csv | Sampling locations per UNITE_ID sequence-based taxon | TRUE |
| locations_per_genus.csv | Sampling locations per EMF genus | TRUE |
| locations_per_species.csv | Sampling locations per EMF species | TRUE |
| host_taxon_matrix_species.csv | Host-species association matrix | TRUE |
| emf_taxa_per_host_species.csv | EMF species counts per host species | TRUE |
| hosts_per_emf_taxon_species.csv | Host species counts per EMF species | TRUE |
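The Available column above can be reproduced with a simple existence check. In this sketch the file names come from the table and `outdata` from the text, while `check_outputs()` is a hypothetical helper rather than part of the analysis scripts.

```r
# File names copied from the outputs table.
expected_files <- c(
  "comprehensive_emf_summary.csv",
  "locations_per_other_id.csv",
  "locations_per_unite_id.csv",
  "locations_per_genus.csv",
  "locations_per_species.csv",
  "host_taxon_matrix_species.csv",
  "emf_taxa_per_host_species.csv",
  "hosts_per_emf_taxon_species.csv"
)

# check_outputs() is a hypothetical helper: reports which files exist.
check_outputs <- function(out_dir = "outdata", files = expected_files) {
  data.frame(
    file      = files,
    available = file.exists(file.path(out_dir, files)),
    stringsAsFactors = FALSE
  )
}
```

Running `check_outputs()` from the project root returns one row per expected file, with `available` set to `TRUE` where the file is present.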
| Component | Details |
|---|---|
| R Version | 4.4.2 |
| Platform | x86_64-apple-darwin20 |
| Running under | macOS Ventura 13.7.6 |
| Locale | en_US.UTF-8/en_CA.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 |
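A table like the one above can be assembled from base R session information. This is a sketch of one way to do it and may differ from how the report generated its table.

```r
si <- sessionInfo()

env_summary <- data.frame(
  Component = c("R Version", "Platform", "Running under", "Locale"),
  Details = c(
    paste(R.version$major, R.version$minor, sep = "."),
    R.version$platform,
    # `running` can be NULL on some systems, so fall back to NA.
    if (is.null(si$running)) NA_character_ else si$running,
    Sys.getlocale()
  ),
  stringsAsFactors = FALSE
)
env_summary
```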
Analysis completed on 2025-07-17
For questions about this analysis, contact the corresponding authors.